Add 5Hz LM planner (CoT-only) and generation-daemon integration #3

Open
Marenz wants to merge 4 commits into master from spacebot-skill

Marenz (Owner) commented Feb 23, 2026

Loads acestep-5Hz-lm-1.7B before the DiT to expand a raw user caption into structured metadata — BPM, key/scale, time signature, language, and a rewritten caption — which the DiT text encoder then conditions on instead of the raw input.

The LM runs in CoT-only mode (stops at </think>). Phase 2 audio code generation is tracked in #2.

The generation daemon gets a --use-lm flag. The Unix socket is now bound before model loading so it appears within ~1s of startup.

Wrap acestep-5Hz-lm-1.7B as a Qwen3 causal LM that runs before the DiT.
Generates until </think>, parses the YAML metadata block, and returns
structured BPM, key/scale, time signature, language, and a rewritten caption
to use as DiT conditioning instead of the raw user input.
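
A minimal sketch of the parsing step described above, for readers who want to see its shape. It assumes the LM emits flat `key: value` lines inside the `<think>` block; the field names (`bpm`, `keyscale`, `timesignature`, `language`, `caption`), the `PlannerMetadata` class, and `parse_cot_metadata` are hypothetical and not taken from this PR. It also swaps in a line-based parser for a real YAML library to stay stdlib-only.

```python
from dataclasses import dataclass, fields
from typing import Optional

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


@dataclass
class PlannerMetadata:
    """Structured fields the planner is described as returning (names assumed)."""
    bpm: Optional[int] = None
    keyscale: Optional[str] = None
    timesignature: Optional[str] = None
    language: Optional[str] = None
    caption: Optional[str] = None


_KNOWN_KEYS = {f.name for f in fields(PlannerMetadata)}


def parse_cot_metadata(lm_text: str) -> PlannerMetadata:
    """Parse flat `key: value` lines found before the </think> stop marker."""
    # Keep only the text before the stop marker; anything after it is ignored.
    body = lm_text.split(THINK_CLOSE, 1)[0]
    # Drop everything up to and including the opening tag, if present.
    if THINK_OPEN in body:
        body = body.split(THINK_OPEN, 1)[1]

    meta = PlannerMetadata()
    for line in body.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        value = value.strip().strip("'\"")
        if not sep or not value or key not in _KNOWN_KEYS:
            continue  # skip blank lines, malformed lines, and unknown keys
        if key == "bpm":
            try:
                meta.bpm = int(float(value))
            except ValueError:
                pass  # leave bpm unset if the LM wrote something non-numeric
        else:
            setattr(meta, key, value)
    return meta
```

Fields the LM omits or garbles stay `None`, so the caller can decide whether to fall back to the raw user input.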

Add a --use-lm flag to load the LM planner on startup and run it before each
generation request. The LM output fills in BPM, key/scale, time signature,
and language; user-specified duration_s always takes priority over the LM
suggestion. The VRAM offload threshold is automatically lowered when the LM
is resident to avoid spurious CPU fallback.
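
The precedence rule above could look roughly like the following sketch. The dictionary keys, `LM_FILLED_KEYS`, and `merge_lm_suggestions` are hypothetical names for illustration. Treating "fills in" as "only fills fields the user left unset" is an assumption, since the PR text does not say how conflicts on BPM, key/scale, time signature, or language are resolved.

```python
from typing import Any, Dict

# Fields the LM output is described as filling in (key names are assumed).
LM_FILLED_KEYS = ("bpm", "keyscale", "timesignature", "language")


def merge_lm_suggestions(request: Dict[str, Any], lm: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a user request with LM planner output without mutating either."""
    merged = dict(request)

    for key in LM_FILLED_KEYS:
        # Assumption: the LM only fills fields the user left unset.
        if merged.get(key) is None and lm.get(key) is not None:
            merged[key] = lm[key]

    # The rewritten caption replaces the raw caption as DiT conditioning.
    if lm.get("caption"):
        merged["caption"] = lm["caption"]

    # A user-specified duration_s always wins; the LM value is a fallback only.
    if merged.get("duration_s") is None and lm.get("duration_s") is not None:
        merged["duration_s"] = lm["duration_s"]

    return merged
```

For example, a request with `duration_s=30.0` keeps that value even if the LM suggests a longer track, while a request without a duration adopts the LM suggestion.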

Also bind the socket before model loading so the file appears within ~1s of
startup rather than after the full 30s load time.
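
The startup ordering can be sketched as below, using only the standard `socket` module. `bind_then_load` and the `load_models` callback are hypothetical names, not the daemon's actual code. The point is that `bind()` creates the socket file immediately and `listen()` lets early clients queue in the backlog while the slow load runs.

```python
import os
import socket
from typing import Any, Callable, Tuple


def bind_then_load(
    sock_path: str, load_models: Callable[[], Any]
) -> Tuple[socket.socket, Any]:
    """Bind and listen on a Unix socket first, then run the slow model load."""
    if os.path.exists(sock_path):
        os.unlink(sock_path)  # clear a stale socket file from a previous run

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)  # the socket file exists on disk from here on
    server.listen(8)        # early connections wait in the backlog

    models = load_models()  # slow step; the caller starts accept() afterwards
    return server, models
```

A supervisor or client that waits for the socket file to appear can then proceed almost immediately. Requests sent during loading are not served until the caller begins calling `accept()` on the returned socket.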